Overview

Dataset statistics

Statistic                        Dataset A    Dataset B
Number of variables              12           12
Number of observations           446          446
Missing cells                    439          423
Missing cells (%)                8.2%         7.9%
Duplicate rows                   0            0
Duplicate rows (%)               0.0%         0.0%
Total size in memory             45.3 KiB     45.3 KiB
Average record size in memory    104.0 B      104.0 B
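
As a rough cross-check, the missing-cell percentages above appear consistent with dividing the missing-cell count by the total number of cells (variables times observations). The sketch below assumes that definition; the helper name `missing_cell_pct` is illustrative and is not part of ydata-profiling.

```python
def missing_cell_pct(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Share of missing cells, in percent, over all cells in the table."""
    total_cells = n_variables * n_observations
    return 100.0 * missing_cells / total_cells


# Counts taken from the Dataset statistics table.
for name, missing in {"Dataset A": 439, "Dataset B": 423}.items():
    print(f"{name}: {missing_cell_pct(missing, 12, 446):.1f}%")
```

With 12 variables and 446 observations there are 5352 cells per dataset, so 439 and 423 missing cells round to the 8.2% and 7.9% reported.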

Variable types

Type           Dataset A    Dataset B
Numeric        5            5
Categorical    4            4
Text           3            3

Alerts

Dataset A                                         Dataset B                                         Alert type
Fare is highly overall correlated with Pclass     Alert not present in this dataset                 High correlation
Pclass is highly overall correlated with Fare     Alert not present in this dataset                 High correlation
Sex is highly overall correlated with Survived    Sex is highly overall correlated with Survived    High correlation
Survived is highly overall correlated with Sex    Survived is highly overall correlated with Sex    High correlation
Age has 88 (19.7%) missing values                 Age has 88 (19.7%) missing values                 Missing
Cabin has 350 (78.5%) missing values              Cabin has 334 (74.9%) missing values              Missing
PassengerId has unique values                     PassengerId has unique values                     Unique
Name has unique values                            Name has unique values                            Unique
SibSp has 293 (65.7%) zeros                       SibSp has 302 (67.7%) zeros                       Zeros
Parch has 339 (76.0%) zeros                       Parch has 332 (74.4%) zeros                       Zeros
Fare has 8 (1.8%) zeros                           Fare has 6 (1.3%) zeros                           Zeros

Reproduction

Item                      Dataset A                     Dataset B
Analysis started          2026-01-13 18:41:20.486528    2026-01-13 18:41:22.579211
Analysis finished         2026-01-13 18:41:22.576181    2026-01-13 18:41:24.640934
Duration                  2.09 seconds                  2.06 seconds
Software version          ydata-profiling v0.0.dev0     ydata-profiling v0.0.dev0
Download configuration    config.json                   config.json
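
The Duration row can be recovered from the start and finish timestamps. The snippet below is a minimal sketch using only the Python standard library; the helper name `duration_seconds` and the format string are assumptions made for illustration.

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction table.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed time between two report timestamps, in seconds."""
    start = datetime.strptime(started, TIMESTAMP_FORMAT)
    end = datetime.strptime(finished, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


runs = {
    "Dataset A": ("2026-01-13 18:41:20.486528", "2026-01-13 18:41:22.576181"),
    "Dataset B": ("2026-01-13 18:41:22.579211", "2026-01-13 18:41:24.640934"),
}
for name, (started, finished) in runs.items():
    print(f"{name}: {duration_seconds(started, finished):.2f} seconds")
```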

Variables

PassengerId
Real number (ℝ)

Statistic       Dataset A    Dataset B
Distinct        446          446
Distinct (%)    100.0%       100.0%
Missing         0            0
Missing (%)     0.0%         0.0%
Infinite        0            0
Infinite (%)    0.0%         0.0%
Mean            431.71076    447.88117
Minimum         1            3
Maximum         889          891
Zeros           0            0
Zeros (%)       0.0%         0.0%
Negative        0            0
Negative (%)    0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Quantile statistics

Statistic                    Dataset A    Dataset B
Minimum                      1            3
5-th percentile              44.5         45
Q1                           200.25       212.25
median                       419          447.5
Q3                           659.75       667
95-th percentile             831.25       850.75
Maximum                      889          891
Range                        888          888
Interquartile range (IQR)    459.5        454.75
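
The Range and IQR rows follow from the other quantiles: range is maximum minus minimum, and IQR is Q3 minus Q1. A small sketch, with the helper name `spread` chosen only for illustration:

```python
def spread(minimum: float, q1: float, q3: float, maximum: float) -> dict:
    """Derive range and interquartile range from four quantile values."""
    return {"range": maximum - minimum, "iqr": q3 - q1}


# PassengerId quantiles from the table above.
print("Dataset A:", spread(minimum=1, q1=200.25, q3=659.75, maximum=889))
print("Dataset B:", spread(minimum=3, q1=212.25, q3=667, maximum=891))
```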

Descriptive statistics

Statistic                          Dataset A       Dataset B
Standard deviation                 255.86813       260.08882
Coefficient of variation (CV)      0.59268415      0.58070944
Kurtosis                           -1.2297974      -1.2253252
Mean                               431.71076       447.88117
Median Absolute Deviation (MAD)    229.5           228
Skewness                           0.061066144     -0.010888186
Sum                                192543          199755
Variance                           65468.498       67646.195
Monotonicity                       Not monotonic   Not monotonic
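
Two of the rows above are derived quantities: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. The sketch below recomputes both from the rounded values in the table, so small differences in the last digits are expected; the function names are illustrative.

```python
def coefficient_of_variation(std: float, mean: float) -> float:
    """Standard deviation expressed as a fraction of the mean."""
    return std / mean


def variance_from_std(std: float) -> float:
    """Variance is the square of the standard deviation."""
    return std ** 2


# PassengerId values as printed in the report (already rounded).
for name, std, mean in [
    ("Dataset A", 255.86813, 431.71076),
    ("Dataset B", 260.08882, 447.88117),
]:
    cv = coefficient_of_variation(std, mean)
    var = variance_from_std(std)
    print(f"{name}: CV={cv:.6f}, variance={var:.2f}")
```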
Histogram with fixed size bins (bins=50)
Dataset A

Value                 Count    Frequency (%)
726                   1        0.2%
72                    1        0.2%
756                   1        0.2%
821                   1        0.2%
855                   1        0.2%
246                   1        0.2%
685                   1        0.2%
666                   1        0.2%
64                    1        0.2%
403                   1        0.2%
Other values (436)    436      97.8%

Dataset B

Value                 Count    Frequency (%)
444                   1        0.2%
835                   1        0.2%
424                   1        0.2%
496                   1        0.2%
499                   1        0.2%
642                   1        0.2%
185                   1        0.2%
857                   1        0.2%
54                    1        0.2%
658                   1        0.2%
Other values (436)    436      97.8%
Dataset A

Value    Count    Frequency (%)
1        1        0.2%
2        1        0.2%
3        1        0.2%
6        1        0.2%
7        1        0.2%
8        1        0.2%
10       1        0.2%
11       1        0.2%
12       1        0.2%
13       1        0.2%

Dataset B

Value    Count    Frequency (%)
3        1        0.2%
4        1        0.2%
5        1        0.2%
8        1        0.2%
13       1        0.2%
15       1        0.2%
16       1        0.2%
17       1        0.2%
18       1        0.2%
19       1        0.2%

Survived
Categorical

Statistic       Dataset A    Dataset B
Distinct        2            2
Distinct (%)    0.4%         0.4%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value    Count (Dataset A)    Count (Dataset B)
0        284                  269
1        162                  177

Length

Statistic        Dataset A    Dataset B
Max length       1            1
Median length    1            1
Mean length      1            1
Min length       1            1

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       446          446
Distinct characters    2            2
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
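
To illustrate what a per-character Unicode property looks like, the sketch below counts general categories with Python's standard `unicodedata` module. This is not necessarily how the report derives its figures (the report lists the category as "(unknown)" here), and the standard library exposes only the general category, not scripts or blocks. The helper name `category_counts` is an assumption.

```python
import unicodedata
from collections import Counter


def category_counts(values) -> Counter:
    """Count Unicode general categories over every character of every value."""
    counts = Counter()
    for value in values:
        for ch in str(value):
            counts[unicodedata.category(ch)] += 1
    return counts


# Survived in Dataset A: 284 zeros and 162 ones, each one character long.
survived_a = ["0"] * 284 + ["1"] * 162
print(category_counts(survived_a))  # all 446 characters are decimal digits ("Nd")
```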

Unique

Statistic     Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

Row        Dataset A    Dataset B
1st row    0            1
2nd row    0            0
3rd row    1            0
4th row    1            0
5th row    0            0

Common Values

Value    Count (A)    Frequency (A)    Count (B)    Frequency (B)
0        284          63.7%            269          60.3%
1        162          36.3%            177          39.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common values plots for Dataset A and Dataset B]

Most occurring characters

Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
0            284          63.7%            269          60.3%
1            162          36.3%            177          39.7%

Most occurring categories

Category     Count (A)    Frequency (A)    Count (B)    Frequency (B)
(unknown)    446          100.0%           446          100.0%

Most frequent character per category

(unknown)
Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
0            284          63.7%            269          60.3%
1            162          36.3%            177          39.7%

Most occurring scripts

Script       Count (A)    Frequency (A)    Count (B)    Frequency (B)
(unknown)    446          100.0%           446          100.0%

Most frequent character per script

(unknown)
Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
0            284          63.7%            269          60.3%
1            162          36.3%            177          39.7%

Most occurring blocks

Block        Count (A)    Frequency (A)    Count (B)    Frequency (B)
(unknown)    446          100.0%           446          100.0%

Most frequent character per block

(unknown)
Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
0            284          63.7%            269          60.3%
1            162          36.3%            177          39.7%

Pclass
Categorical

Statistic       Dataset A    Dataset B
Distinct        3            3
Distinct (%)    0.7%         0.7%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value    Count (Dataset A)    Count (Dataset B)
3        252                  230
1        99                   124
2        95                   92

Length

Statistic        Dataset A    Dataset B
Max length       1            1
Median length    1            1
Mean length      1            1
Min length       1            1

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       446          446
Distinct characters    3            3
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic     Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

Row        Dataset A    Dataset B
1st row    3            2
2nd row    3            3
3rd row    2            3
4th row    1            3
5th row    2            1

Common Values

Value    Count (A)    Frequency (A)    Count (B)    Frequency (B)
3        252          56.5%            230          51.6%
1        99           22.2%            124          27.8%
2        95           21.3%            92           20.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common values plots for Dataset A and Dataset B]

Most occurring characters

Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
3            252          56.5%            230          51.6%
1            99           22.2%            124          27.8%
2            95           21.3%            92           20.6%

Most occurring categories

Category     Count (A)    Frequency (A)    Count (B)    Frequency (B)
(unknown)    446          100.0%           446          100.0%

Most frequent character per category

(unknown)
Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
3            252          56.5%            230          51.6%
1            99           22.2%            124          27.8%
2            95           21.3%            92           20.6%

Most occurring scripts

Script       Count (A)    Frequency (A)    Count (B)    Frequency (B)
(unknown)    446          100.0%           446          100.0%

Most frequent character per script

(unknown)
Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
3            252          56.5%            230          51.6%
1            99           22.2%            124          27.8%
2            95           21.3%            92           20.6%

Most occurring blocks

Block        Count (A)    Frequency (A)    Count (B)    Frequency (B)
(unknown)    446          100.0%           446          100.0%

Most frequent character per block

(unknown)
Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
3            252          56.5%            230          51.6%
1            99           22.2%            124          27.8%
2            95           21.3%            92           20.6%

Name
Text

Statistic       Dataset A    Dataset B
Distinct        446          446
Distinct (%)    100.0%       100.0%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Length

Statistic        Dataset A    Dataset B
Max length       67           67
Median length    47           48
Mean length      27.255605    27.165919
Min length       12           13

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       12156        12116
Distinct characters    60           59
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic     Dataset A    Dataset B
Unique        446          446
Unique (%)    100.0%       100.0%

Sample

Row        Dataset A                                             Dataset B
1st row    Oreskovic, Mr. Luka                                   Reynaldo, Ms. Encarnacion
2nd row    Goodwin, Miss. Lillian Amy                            Allum, Mr. Owen George
3rd row    Hamalainen, Master. Viljo                             Danbom, Mrs. Ernst Gilbert (Anna Sigrid Maria Brogren)
4th row    Hays, Mrs. Charles Melville (Clara Jennings Gregg)    Yousseff, Mr. Gerious
5th row    Carter, Mrs. Ernest Courtenay (Lilian Hughes)         Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
Dataset A

Value                 Count    Frequency (%)
mr                    266      14.6%
miss                  91       5.0%
mrs                   64       3.5%
william               40       2.2%
john                  21       1.2%
master                21       1.2%
henry                 20       1.1%
thomas                14       0.8%
charles               10       0.5%
george                10       0.5%
Other values (903)    1268     69.5%

Dataset B

Value                 Count    Frequency (%)
mr                    256      14.0%
miss                  87       4.8%
mrs                   73       4.0%
william               27       1.5%
master                22       1.2%
john                  22       1.2%
henry                 19       1.0%
george                18       1.0%
james                 14       0.8%
charles               13       0.7%
Other values (916)    1279     69.9%

Most occurring characters

ValueCountFrequency (%)
1379
 
11.3%
r976
 
8.0%
e860
 
7.1%
a838
 
6.9%
i684
 
5.6%
n678
 
5.6%
s653
 
5.4%
M557
 
4.6%
l547
 
4.5%
o535
 
4.4%
Other values (50)4449
36.6%
ValueCountFrequency (%)
1386
 
11.4%
r1017
 
8.4%
e890
 
7.3%
a816
 
6.7%
n671
 
5.5%
i652
 
5.4%
s633
 
5.2%
M565
 
4.7%
l534
 
4.4%
o513
 
4.2%
Other values (49)4439
36.6%

Most occurring categories

ValueCountFrequency (%)
(unknown)12156
100.0%
ValueCountFrequency (%)
(unknown)12116
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
1379
 
11.3%
r976
 
8.0%
e860
 
7.1%
a838
 
6.9%
i684
 
5.6%
n678
 
5.6%
s653
 
5.4%
M557
 
4.6%
l547
 
4.5%
o535
 
4.4%
Other values (50)4449
36.6%
ValueCountFrequency (%)
1386
 
11.4%
r1017
 
8.4%
e890
 
7.3%
a816
 
6.7%
n671
 
5.5%
i652
 
5.4%
s633
 
5.2%
M565
 
4.7%
l534
 
4.4%
o513
 
4.2%
Other values (49)4439
36.6%

Most occurring scripts

ValueCountFrequency (%)
(unknown)12156
100.0%
ValueCountFrequency (%)
(unknown)12116
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
1379
 
11.3%
r976
 
8.0%
e860
 
7.1%
a838
 
6.9%
i684
 
5.6%
n678
 
5.6%
s653
 
5.4%
M557
 
4.6%
l547
 
4.5%
o535
 
4.4%
Other values (50)4449
36.6%
ValueCountFrequency (%)
1386
 
11.4%
r1017
 
8.4%
e890
 
7.3%
a816
 
6.7%
n671
 
5.5%
i652
 
5.4%
s633
 
5.2%
M565
 
4.7%
l534
 
4.4%
o513
 
4.2%
Other values (49)4439
36.6%

Most occurring blocks

ValueCountFrequency (%)
(unknown)12156
100.0%
ValueCountFrequency (%)
(unknown)12116
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
1379
 
11.3%
r976
 
8.0%
e860
 
7.1%
a838
 
6.9%
i684
 
5.6%
n678
 
5.6%
s653
 
5.4%
M557
 
4.6%
l547
 
4.5%
o535
 
4.4%
Other values (50)4449
36.6%
ValueCountFrequency (%)
1386
 
11.4%
r1017
 
8.4%
e890
 
7.3%
a816
 
6.7%
n671
 
5.5%
i652
 
5.4%
s633
 
5.2%
M565
 
4.7%
l534
 
4.4%
o513
 
4.2%
Other values (49)4439
36.6%

Sex
Categorical

Statistic       Dataset A    Dataset B
Distinct        2            2
Distinct (%)    0.4%         0.4%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Value     Count (Dataset A)    Count (Dataset B)
male      290                  285
female    156                  161

Length

Statistic        Dataset A    Dataset B
Max length       6            6
Median length    4            4
Mean length      4.6995516    4.7219731
Min length       4            4

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       2096         2106
Distinct characters    5            5
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
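
For a categorical variable with only two values, the length and character figures can be reproduced from the value counts alone. The sketch below does this for Sex; `text_summary` is a hypothetical helper written for this example, not a ydata-profiling function.

```python
from collections import Counter


def text_summary(value_counts: dict[str, int]):
    """Return (total characters, mean length, per-character counts)."""
    n_rows = sum(value_counts.values())
    total_chars = sum(len(value) * count for value, count in value_counts.items())
    char_counts = Counter()
    for value, count in value_counts.items():
        for ch in value:
            # Each occurrence of the value contributes each of its characters once.
            char_counts[ch] += count
    return total_chars, total_chars / n_rows, char_counts


# Sex value counts for Dataset A.
total, mean_length, chars = text_summary({"male": 290, "female": 156})
print(total, round(mean_length, 7), chars.most_common())
```

For Dataset A this gives 2096 characters in total and 602 occurrences of "e" (290 from "male" plus 2 × 156 from "female"), matching the tables in this section.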

Unique

Statistic     Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

Row        Dataset A    Dataset B
1st row    male         female
2nd row    female       male
3rd row    male         female
4th row    female       male
5th row    female       female

Common Values

Value     Count (A)    Frequency (A)    Count (B)    Frequency (B)
male      290          65.0%            285          63.9%
female    156          35.0%            161          36.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figures: common values plots for Dataset A and Dataset B]

Most occurring characters

Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
e            602          28.7%            607          28.8%
m            446          21.3%            446          21.2%
a            446          21.3%            446          21.2%
l            446          21.3%            446          21.2%
f            156          7.4%             161          7.6%

Most occurring categories

Category     Count (A)    Frequency (A)    Count (B)    Frequency (B)
(unknown)    2096         100.0%           2106         100.0%

Most frequent character per category

(unknown)
Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
e            602          28.7%            607          28.8%
m            446          21.3%            446          21.2%
a            446          21.3%            446          21.2%
l            446          21.3%            446          21.2%
f            156          7.4%             161          7.6%

Most occurring scripts

Script       Count (A)    Frequency (A)    Count (B)    Frequency (B)
(unknown)    2096         100.0%           2106         100.0%

Most frequent character per script

(unknown)
Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
e            602          28.7%            607          28.8%
m            446          21.3%            446          21.2%
a            446          21.3%            446          21.2%
l            446          21.3%            446          21.2%
f            156          7.4%             161          7.6%

Most occurring blocks

Block        Count (A)    Frequency (A)    Count (B)    Frequency (B)
(unknown)    2096         100.0%           2106         100.0%

Most frequent character per block

(unknown)
Character    Count (A)    Frequency (A)    Count (B)    Frequency (B)
e            602          28.7%            607          28.8%
m            446          21.3%            446          21.2%
a            446          21.3%            446          21.2%
l            446          21.3%            446          21.2%
f            156          7.4%             161          7.6%

Age
Real number (ℝ)

Statistic       Dataset A    Dataset B
Distinct        74           77
Distinct (%)    20.7%        21.5%
Missing         88           88
Missing (%)     19.7%        19.7%
Infinite        0            0
Infinite (%)    0.0%         0.0%
Mean            28.683436    30.131061
Minimum         0.67         0.42
Maximum         71           80
Zeros           0            0
Zeros (%)       0.0%         0.0%
Negative        0            0
Negative (%)    0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB
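
The Distinct (%) figures for Age appear to be computed over non-missing observations rather than over all 446 rows: 74 distinct values out of 358 non-missing ages is about 20.7%. The sketch below assumes that definition, which is inferred from the numbers in this report and not taken from the tool's documentation; `distinct_pct` is an illustrative name.

```python
def distinct_pct(n_distinct: int, n_rows: int, n_missing: int) -> float:
    """Distinct values as a percentage of non-missing observations (assumed definition)."""
    return 100.0 * n_distinct / (n_rows - n_missing)


# Age: 446 rows and 88 missing values in both datasets.
for name, n_distinct in {"Dataset A": 74, "Dataset B": 77}.items():
    print(f"{name}: {distinct_pct(n_distinct, 446, 88):.1f}%")
```

The same assumption also fits Cabin in Dataset A, where 81 distinct values over 96 non-missing rows gives roughly 84.4%.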

Quantile statistics

Statistic                    Dataset A    Dataset B
Minimum                      0.67         0.42
5-th percentile              3            3
Q1                           19           20
median                       28           28.25
Q3                           38           39
95-th percentile             52.3         60
Maximum                      71           80
Range                        70.33        79.58
Interquartile range (IQR)    19           19

Descriptive statistics

Statistic                          Dataset A        Dataset B
Standard deviation                 14.199761        15.582699
Coefficient of variation (CV)      0.49505092       0.51716397
Kurtosis                           -0.13480891      0.15080241
Mean                               28.683436        30.131061
Median Absolute Deviation (MAD)    9                9.25
Skewness                           0.18510529       0.43158374
Sum                                10268.67         10786.92
Variance                           201.63322        242.82052
Monotonicity                       Not monotonic    Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1817
 
3.8%
2816
 
3.6%
2215
 
3.4%
2513
 
2.9%
1913
 
2.9%
3612
 
2.7%
2011
 
2.5%
3011
 
2.5%
3910
 
2.2%
3110
 
2.2%
Other values (64)230
51.6%
(Missing)88
 
19.7%
ValueCountFrequency (%)
2417
 
3.8%
1915
 
3.4%
1814
 
3.1%
3014
 
3.1%
2813
 
2.9%
3513
 
2.9%
2113
 
2.9%
3111
 
2.5%
2611
 
2.5%
2210
 
2.2%
Other values (67)227
50.9%
(Missing)88
 
19.7%
ValueCountFrequency (%)
0.671
 
0.2%
0.751
 
0.2%
0.831
 
0.2%
0.921
 
0.2%
14
0.9%
27
1.6%
35
1.1%
46
1.3%
53
0.7%
61
 
0.2%
ValueCountFrequency (%)
0.421
 
0.2%
0.751
 
0.2%
0.831
 
0.2%
0.921
 
0.2%
14
0.9%
27
1.6%
34
0.9%
47
1.6%
62
 
0.4%
72
 
0.4%
ValueCountFrequency (%)
0.421
 
0.2%
0.751
 
0.2%
0.831
 
0.2%
0.921
 
0.2%
14
0.9%
27
1.6%
34
0.9%
47
1.6%
62
 
0.4%
72
 
0.4%
ValueCountFrequency (%)
0.671
 
0.2%
0.751
 
0.2%
0.831
 
0.2%
0.921
 
0.2%
14
0.9%
27
1.6%
35
1.1%
46
1.3%
53
0.7%
61
 
0.2%

SibSp
Real number (ℝ)

Statistic       Dataset A     Dataset B
Distinct        7             7
Distinct (%)    1.6%          1.6%
Missing         0             0
Missing (%)     0.0%          0.0%
Infinite        0             0
Infinite (%)    0.0%          0.0%
Mean            0.50672646    0.54035874
Minimum         0             0
Maximum         8             8
Zeros           293           302
Zeros (%)       65.7%         67.7%
Negative        0             0
Negative (%)    0.0%          0.0%
Memory size     7.0 KiB       7.0 KiB

Quantile statistics

Statistic                    Dataset A    Dataset B
Minimum                      0            0
5-th percentile              0            0
Q1                           0            0
median                       0            0
Q3                           1            1
95-th percentile             2            3
Maximum                      8            8
Range                        8            8
Interquartile range (IQR)    1            1

Descriptive statistics

Statistic                          Dataset A        Dataset B
Standard deviation                 0.93554025       1.0982862
Coefficient of variation (CV)      1.8462431        2.0325131
Kurtosis                           13.668635        15.794163
Mean                               0.50672646       0.54035874
Median Absolute Deviation (MAD)    0                0
Skewness                           3.0652538        3.4290739
Sum                                226              241
Variance                           0.87523555       1.2062327
Monotonicity                       Not monotonic    Not monotonic
Histogram with fixed size bins (bins=7)
Value    Count (A)    Frequency (A)    Count (B)    Frequency (B)
0        293          65.7%            302          67.7%
1        117          26.2%            102          22.9%
2        16           3.6%             16           3.6%
3        9            2.0%             11           2.5%
4        8            1.8%             10           2.2%
5        2            0.4%             2            0.4%
8        1            0.2%             3            0.7%

Parch
Real number (ℝ)

Statistic       Dataset A     Dataset B
Distinct        6             6
Distinct (%)    1.3%          1.3%
Missing         0             0
Missing (%)     0.0%          0.0%
Infinite        0             0
Infinite (%)    0.0%          0.0%
Mean            0.37668161    0.39013453
Minimum         0             0
Maximum         5             5
Zeros           339           332
Zeros (%)       76.0%         74.4%
Negative        0             0
Negative (%)    0.0%          0.0%
Memory size     7.0 KiB       7.0 KiB

Quantile statistics

Statistic                    Dataset A    Dataset B
Minimum                      0            0
5-th percentile              0            0
Q1                           0            0
median                       0            0
Q3                           0            1
95-th percentile             2            2
Maximum                      5            5
Range                        5            5
Interquartile range (IQR)    0            1

Descriptive statistics

Statistic                          Dataset A        Dataset B
Standard deviation                 0.79706976       0.76160179
Coefficient of variation (CV)      2.1160304        1.9521517
Kurtosis                           9.192299         5.615292
Mean                               0.37668161       0.39013453
Median Absolute Deviation (MAD)    0                0
Skewness                           2.7221748        2.2146888
Sum                                168              174
Variance                           0.6353202        0.58003729
Monotonicity                       Not monotonic    Not monotonic
Histogram with fixed size bins (bins=6)
Value    Count (A)    Frequency (A)    Count (B)    Frequency (B)
0        339          76.0%            332          74.4%
1        63           14.1%            65           14.6%
2        35           7.8%             42           9.4%
3        4            0.9%             4            0.9%
4        2            0.4%             2            0.4%
5        3            0.7%             1            0.2%

Ticket
Text

Statistic       Dataset A    Dataset B
Distinct        371          383
Distinct (%)    83.2%        85.9%
Missing         0            0
Missing (%)     0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Length

Statistic        Dataset A    Dataset B
Max length       18           18
Median length    17           17
Mean length      6.7668161    6.8251121
Min length       4            3

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       3018         3044
Distinct characters    35           32
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic     Dataset A    Dataset B
Unique        313          339
Unique (%)    70.2%        76.0%

Sample

Row        Dataset A    Dataset B
1st row    315094       230434
2nd row    CA 2144      2223
3rd row    250649       347080
4th row    12749        2627
5th row    244252       113781
ValueCountFrequency (%)
pc28
 
5.0%
c.a12
 
2.2%
a/58
 
1.4%
ston/o7
 
1.3%
27
 
1.3%
sc/paris6
 
1.1%
soton/o.q5
 
0.9%
w./c4
 
0.7%
ca4
 
0.7%
ston/o24
 
0.7%
Other values (390)473
84.8%
ValueCountFrequency (%)
pc36
 
6.2%
c.a14
 
2.4%
a/59
 
1.6%
ston/o6
 
1.0%
26
 
1.0%
ca6
 
1.0%
w./c5
 
0.9%
3826525
 
0.9%
f.c.c5
 
0.9%
174214
 
0.7%
Other values (403)481
83.4%
2026-01-13T18:41:27.228744image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/

Most occurring characters

ValueCountFrequency (%)
3378
12.5%
1337
11.2%
2302
10.0%
7259
8.6%
4240
 
8.0%
0209
 
6.9%
6207
 
6.9%
5195
 
6.5%
9174
 
5.8%
8117
 
3.9%
Other values (25)600
19.9%
ValueCountFrequency (%)
3368
12.1%
1350
11.5%
2303
10.0%
7246
 
8.1%
4246
 
8.1%
6204
 
6.7%
0197
 
6.5%
5194
 
6.4%
9161
 
5.3%
8135
 
4.4%
Other values (22)640
21.0%

Most occurring categories

ValueCountFrequency (%)
(unknown)3018
100.0%
ValueCountFrequency (%)
(unknown)3044
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
3378
12.5%
1337
11.2%
2302
10.0%
7259
8.6%
4240
 
8.0%
0209
 
6.9%
6207
 
6.9%
5195
 
6.5%
9174
 
5.8%
8117
 
3.9%
Other values (25)600
19.9%
ValueCountFrequency (%)
3368
12.1%
1350
11.5%
2303
10.0%
7246
 
8.1%
4246
 
8.1%
6204
 
6.7%
0197
 
6.5%
5194
 
6.4%
9161
 
5.3%
8135
 
4.4%
Other values (22)640
21.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown)3018
100.0%
ValueCountFrequency (%)
(unknown)3044
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
3378
12.5%
1337
11.2%
2302
10.0%
7259
8.6%
4240
 
8.0%
0209
 
6.9%
6207
 
6.9%
5195
 
6.5%
9174
 
5.8%
8117
 
3.9%
Other values (25)600
19.9%
ValueCountFrequency (%)
3368
12.1%
1350
11.5%
2303
10.0%
7246
 
8.1%
4246
 
8.1%
6204
 
6.7%
0197
 
6.5%
5194
 
6.4%
9161
 
5.3%
8135
 
4.4%
Other values (22)640
21.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown)3018
100.0%
ValueCountFrequency (%)
(unknown)3044
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
3378
12.5%
1337
11.2%
2302
10.0%
7259
8.6%
4240
 
8.0%
0209
 
6.9%
6207
 
6.9%
5195
 
6.5%
9174
 
5.8%
8117
 
3.9%
Other values (25)600
19.9%
ValueCountFrequency (%)
3368
12.1%
1350
11.5%
2303
10.0%
7246
 
8.1%
4246
 
8.1%
6204
 
6.7%
0197
 
6.5%
5194
 
6.4%
9161
 
5.3%
8135
 
4.4%
Other values (22)640
21.0%

Fare
Real number (ℝ)

Statistic       Dataset A    Dataset B
Distinct        179          185
Distinct (%)    40.1%        41.5%
Missing         0            0
Missing (%)     0.0%         0.0%
Infinite        0            0
Infinite (%)    0.0%         0.0%
Mean            31.954296    35.230969
Minimum         0            0
Maximum         512.3292     512.3292
Zeros           8            6
Zeros (%)       1.8%         1.3%
Negative        0            0
Negative (%)    0.0%         0.0%
Memory size     7.0 KiB      7.0 KiB

Quantile statistics

Statistic                    Dataset A    Dataset B
Minimum                      0            0
5-th percentile              7.225        7.225
Q1                           7.8958       8.0344
median                       13.64585     15.925
Q3                           30.92395     32.05935
95-th percentile             110.38748    120
Maximum                      512.3292     512.3292
Range                        512.3292     512.3292
Interquartile range (IQR)    23.02815     24.02495

Descriptive statistics

Statistic                          Dataset A        Dataset B
Standard deviation                 51.265369        54.455547
Coefficient of variation (CV)      1.6043342        1.5456727
Kurtosis                           37.601079        29.427809
Mean                               31.954296        35.230969
Median Absolute Deviation (MAD)    6.39585          8.675
Skewness                           5.1510563        4.5612181
Sum                                14251.616        15713.012
Variance                           2628.1381        2965.4066
Monotonicity                       Not monotonic    Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
7.895823
 
5.2%
1321
 
4.7%
7.7518
 
4.0%
8.0517
 
3.8%
2616
 
3.6%
7.92512
 
2.7%
7.22929
 
2.0%
10.58
 
1.8%
08
 
1.8%
7.85428
 
1.8%
Other values (169)306
68.6%
ValueCountFrequency (%)
2619
 
4.3%
8.0519
 
4.3%
10.518
 
4.0%
1317
 
3.8%
7.895817
 
3.8%
7.7513
 
2.9%
26.559
 
2.0%
7.9258
 
1.8%
7.22928
 
1.8%
7.7758
 
1.8%
Other values (175)310
69.5%
ValueCountFrequency (%)
08
1.8%
4.01251
 
0.2%
6.23751
 
0.2%
6.49582
 
0.4%
6.751
 
0.2%
6.9751
 
0.2%
7.055
1.1%
7.05421
 
0.2%
7.1252
 
0.4%
7.2256
1.3%
ValueCountFrequency (%)
06
1.3%
6.451
 
0.2%
6.752
 
0.4%
6.9751
 
0.2%
7.04581
 
0.2%
7.053
 
0.7%
7.05421
 
0.2%
7.1254
0.9%
7.2257
1.6%
7.22928
1.8%
ValueCountFrequency (%)
06
1.3%
6.451
 
0.2%
6.752
 
0.4%
6.9751
 
0.2%
7.04581
 
0.2%
7.053
 
0.7%
7.05421
 
0.2%
7.1254
0.9%
7.2257
1.6%
7.22928
1.8%
ValueCountFrequency (%)
08
1.8%
4.01251
 
0.2%
6.23751
 
0.2%
6.49582
 
0.4%
6.751
 
0.2%
6.9751
 
0.2%
7.055
1.1%
7.05421
 
0.2%
7.1252
 
0.4%
7.2256
1.3%

Cabin
Text

Statistic       Dataset A    Dataset B
Distinct        81           91
Distinct (%)    84.4%        81.2%
Missing         350          334
Missing (%)     78.5%        74.9%
Memory size     7.0 KiB      7.0 KiB

Length

Statistic        Dataset A    Dataset B
Max length       15           15
Median length    3            3
Mean length      3.6145833    3.7232143
Min length       1            1

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       347          417
Distinct characters    18           18
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic     Dataset A    Dataset B
Unique        67           72
Unique (%)    69.8%        64.3%

Sample

Row        Dataset A          Dataset B
1st row    B69                C22 C26
2nd row    C78                B35
3rd row    A32                G6
4th row    B57 B59 B63 B66    B86
5th row    D15                G6
ValueCountFrequency (%)
b963
 
2.7%
b983
 
2.7%
c922
 
1.8%
c1262
 
1.8%
d362
 
1.8%
e332
 
1.8%
f42
 
1.8%
c932
 
1.8%
c222
 
1.8%
c262
 
1.8%
Other values (82)91
80.5%
ValueCountFrequency (%)
c223
 
2.2%
c263
 
2.2%
b963
 
2.2%
b983
 
2.2%
f3
 
2.2%
b352
 
1.5%
c232
 
1.5%
c252
 
1.5%
f332
 
1.5%
c272
 
1.5%
Other values (92)110
81.5%
2026-01-13T18:41:28.180642image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/

Most occurring characters

ValueCountFrequency (%)
C40
11.5%
231
 
8.9%
B29
 
8.4%
128
 
8.1%
328
 
8.1%
627
 
7.8%
520
 
5.8%
919
 
5.5%
419
 
5.5%
17
 
4.9%
Other values (8)89
25.6%
ValueCountFrequency (%)
244
10.6%
C42
10.1%
B40
 
9.6%
635
 
8.4%
335
 
8.4%
127
 
6.5%
524
 
5.8%
23
 
5.5%
423
 
5.5%
822
 
5.3%
Other values (8)102
24.5%

Most occurring categories

ValueCountFrequency (%)
(unknown)347
100.0%
ValueCountFrequency (%)
(unknown)417
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
C40
11.5%
231
 
8.9%
B29
 
8.4%
128
 
8.1%
328
 
8.1%
627
 
7.8%
520
 
5.8%
919
 
5.5%
419
 
5.5%
17
 
4.9%
Other values (8)89
25.6%
ValueCountFrequency (%)
244
10.6%
C42
10.1%
B40
 
9.6%
635
 
8.4%
335
 
8.4%
127
 
6.5%
524
 
5.8%
23
 
5.5%
423
 
5.5%
822
 
5.3%
Other values (8)102
24.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown)347
100.0%
ValueCountFrequency (%)
(unknown)417
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
C40
11.5%
231
 
8.9%
B29
 
8.4%
128
 
8.1%
328
 
8.1%
627
 
7.8%
520
 
5.8%
919
 
5.5%
419
 
5.5%
17
 
4.9%
Other values (8)89
25.6%
ValueCountFrequency (%)
244
10.6%
C42
10.1%
B40
 
9.6%
635
 
8.4%
335
 
8.4%
127
 
6.5%
524
 
5.8%
23
 
5.5%
423
 
5.5%
822
 
5.3%
Other values (8)102
24.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown)347
100.0%
ValueCountFrequency (%)
(unknown)417
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
C40
11.5%
231
 
8.9%
B29
 
8.4%
128
 
8.1%
328
 
8.1%
627
 
7.8%
520
 
5.8%
919
 
5.5%
419
 
5.5%
17
 
4.9%
Other values (8)89
25.6%
ValueCountFrequency (%)
244
10.6%
C42
10.1%
B40
 
9.6%
635
 
8.4%
335
 
8.4%
127
 
6.5%
524
 
5.8%
23
 
5.5%
423
 
5.5%
822
 
5.3%
Other values (8)102
24.5%

Embarked
Categorical

Statistic       Dataset A    Dataset B
Distinct        3            3
Distinct (%)    0.7%         0.7%
Missing         1            1
Missing (%)     0.2%         0.2%
Memory size     7.0 KiB      7.0 KiB

Value    Count (Dataset A)    Count (Dataset B)
S        326                  323
C        83                   84
Q        36                   38

Length

Statistic        Dataset A    Dataset B
Max length       1            1
Median length    1            1
Mean length      1            1
Min length       1            1

Characters and Unicode

Statistic              Dataset A    Dataset B
Total characters       445          445
Distinct characters    3            3
Distinct categories    1            1
Distinct scripts       1            1
Distinct blocks        1            1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic     Dataset A    Dataset B
Unique        0            0
Unique (%)    0.0%         0.0%

Sample

Row        Dataset A    Dataset B
1st row    S            S
2nd row    S            S
3rd row    S            S
4th row    S            C
5th row    S            S

Common Values

Value        Count (A)    Frequency (A)    Count (B)    Frequency (B)
S            326          73.1%            323          72.4%
C            83           18.6%            84           18.8%
Q            36           8.1%             38           8.5%
(Missing)    1            0.2%             1            0.2%

Length

2026-01-13T18:41:28.265707image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
Histogram of lengths of the category

Common Values (Plot)

Dataset A

2026-01-13T18:41:28.322160image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/

Dataset B

2026-01-13T18:41:28.362334image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
ValueCountFrequency (%)
s326
73.3%
c83
 
18.7%
q36
 
8.1%
ValueCountFrequency (%)
s323
72.6%
c84
 
18.9%
q38
 
8.5%

Most occurring characters

ValueCountFrequency (%)
S326
73.3%
C83
 
18.7%
Q36
 
8.1%
ValueCountFrequency (%)
S323
72.6%
C84
 
18.9%
Q38
 
8.5%

Most occurring categories

ValueCountFrequency (%)
(unknown)445
100.0%
ValueCountFrequency (%)
(unknown)445
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
S326
73.3%
C83
 
18.7%
Q36
 
8.1%
ValueCountFrequency (%)
S323
72.6%
C84
 
18.9%
Q38
 
8.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown)445
100.0%
ValueCountFrequency (%)
(unknown)445
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
S326
73.3%
C83
 
18.7%
Q36
 
8.1%
ValueCountFrequency (%)
S323
72.6%
C84
 
18.9%
Q38
 
8.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown)445
100.0%
ValueCountFrequency (%)
(unknown)445
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
S326
73.3%
C83
 
18.7%
Q36
 
8.1%
ValueCountFrequency (%)
S323
72.6%
C84
 
18.9%
Q38
 
8.5%

Interactions

[Figures: 25 interaction plots each for Dataset A and Dataset B]

Correlations

Dataset A

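| | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived |
| Age | 1.000 | 0.000 | 0.173 | -0.213 | 0.086 | 0.299 | 0.123 | -0.191 | 0.242 |
| Embarked | 0.000 | 1.000 | 0.170 | 0.045 | 0.033 | 0.251 | 0.117 | 0.000 | 0.165 |
| Fare | 0.173 | 0.170 | 1.000 | 0.396 | -0.004 | 0.538 | 0.166 | 0.453 | 0.298 |
| Parch | -0.213 | 0.045 | 0.396 | 1.000 | -0.038 | 0.046 | 0.212 | 0.435 | 0.227 |
| PassengerId | 0.086 | 0.033 | -0.004 | -0.038 | 1.000 | 0.000 | 0.000 | -0.093 | 0.124 |
| Pclass | 0.299 | 0.251 | 0.538 | 0.046 | 0.000 | 1.000 | 0.108 | 0.152 | 0.340 |
| Sex | 0.123 | 0.117 | 0.166 | 0.212 | 0.000 | 0.108 | 1.000 | 0.141 | 0.544 |
| SibSp | -0.191 | 0.000 | 0.453 | 0.435 | -0.093 | 0.152 | 0.141 | 1.000 | 0.171 |
| Survived | 0.242 | 0.165 | 0.298 | 0.227 | 0.124 | 0.340 | 0.544 | 0.171 | 1.000 |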

Dataset B

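| | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived |
| Age | 1.000 | 0.042 | 0.094 | -0.291 | 0.017 | 0.263 | 0.075 | -0.173 | 0.091 |
| Embarked | 0.042 | 1.000 | 0.213 | 0.072 | 0.000 | 0.287 | 0.076 | 0.116 | 0.108 |
| Fare | 0.094 | 0.213 | 1.000 | 0.421 | 0.001 | 0.467 | 0.180 | 0.450 | 0.259 |
| Parch | -0.291 | 0.072 | 0.421 | 1.000 | 0.016 | 0.000 | 0.250 | 0.447 | 0.121 |
| PassengerId | 0.017 | 0.000 | 0.001 | 0.016 | 1.000 | 0.000 | 0.000 | -0.021 | 0.139 |
| Pclass | 0.263 | 0.287 | 0.467 | 0.000 | 0.000 | 1.000 | 0.090 | 0.142 | 0.346 |
| Sex | 0.075 | 0.076 | 0.180 | 0.250 | 0.000 | 0.090 | 1.000 | 0.254 | 0.519 |
| SibSp | -0.173 | 0.116 | 0.450 | 0.447 | -0.021 | 0.142 | 0.254 | 1.000 | 0.209 |
| Survived | 0.091 | 0.108 | 0.259 | 0.121 | 0.139 | 0.346 | 0.519 | 0.209 | 1.000 |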
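
The two matrices can be compared pair by pair to see which variable pairs cross a "high correlation" cutoff in one dataset but not the other. The sketch below is illustrative only: the 0.5 cutoff is an assumption (the report does not state the threshold it uses), `high_pairs` is a hypothetical helper, and only a handful of entries are copied from the Dataset A and Dataset B matrices above.

```python
# A few off-diagonal entries copied from the two correlation matrices.
corr_a = {
    ("Fare", "Pclass"): 0.538,
    ("Sex", "Survived"): 0.544,
    ("Fare", "SibSp"): 0.453,
    ("Pclass", "Survived"): 0.340,
}
corr_b = {
    ("Fare", "Pclass"): 0.467,
    ("Sex", "Survived"): 0.519,
    ("Fare", "SibSp"): 0.450,
    ("Pclass", "Survived"): 0.346,
}

THRESHOLD = 0.5  # assumed cutoff, not stated in the report


def high_pairs(corr, threshold=THRESHOLD):
    """Return the variable pairs whose absolute correlation reaches the cutoff."""
    return sorted(pair for pair, value in corr.items() if abs(value) >= threshold)


# Pairs that are flagged in Dataset A but not in Dataset B.
only_in_a = set(high_pairs(corr_a)) - set(high_pairs(corr_b))

print("Dataset A:", high_pairs(corr_a))
print("Dataset B:", high_pairs(corr_b))
print("Only in A:", sorted(only_in_a))
```

Under this assumed cutoff, Fare and Pclass are flagged only in Dataset A, while Sex and Survived are flagged in both.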

Missing values

Dataset A

A simple visualization of nullity by column.

Dataset B

A simple visualization of nullity by column.

Dataset A

The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Dataset B

The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Dataset A

The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Dataset B

The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
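
Nullity correlation, as described above, can be sketched as the Pearson correlation between per-column "is missing" indicators. The example below is a minimal standard-library sketch, not the report's own implementation; `nullity_correlation` and the toy rows are hypothetical. With pandas, `df.isnull().corr()` computes the same quantity.

```python
from math import sqrt


def nullity_correlation(rows, columns):
    """Pearson correlation between per-column missing-value indicators."""
    # 1.0 where a value is missing (None here), 0.0 where it is present.
    indicators = {
        col: [1.0 if row.get(col) is None else 0.0 for row in rows]
        for col in columns
    }
    # Columns that are never or always missing have zero variance, so
    # their correlation is undefined; leave them out.
    varying = [c for c in columns if 0 < sum(indicators[c]) < len(rows)]

    def pearson(x, y):
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(y) / n
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        var_x = sum((a - mean_x) ** 2 for a in x)
        var_y = sum((b - mean_y) ** 2 for b in y)
        return cov / sqrt(var_x * var_y)

    return {
        (a, b): pearson(indicators[a], indicators[b])
        for a in varying
        for b in varying
    }


# Made-up rows for illustration only; not taken from Dataset A or B.
toy_rows = [
    {"Age": 22.0, "Cabin": "C85", "Fare": 7.25},
    {"Age": None, "Cabin": None, "Fare": 8.05},
    {"Age": 35.0, "Cabin": None, "Fare": 53.10},
    {"Age": None, "Cabin": None, "Fare": 26.00},
]
result = nullity_correlation(toy_rows, ["Age", "Cabin", "Fare"])
for pair, value in sorted(result.items()):
    print(pair, round(value, 3))
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means their missingness is unrelated. Fare is dropped from the toy result because it is never missing.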

Sample

Dataset A

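| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
| 725 | 726 | 0 | 3 | Oreskovic, Mr. Luka | male | 20.00 | 0 | 0 | 315094 | 8.6625 | NaN | S |
| 71 | 72 | 0 | 3 | Goodwin, Miss. Lillian Amy | female | 16.00 | 5 | 2 | CA 2144 | 46.9000 | NaN | S |
| 755 | 756 | 1 | 2 | Hamalainen, Master. Viljo | male | 0.67 | 1 | 1 | 250649 | 14.5000 | NaN | S |
| 820 | 821 | 1 | 1 | Hays, Mrs. Charles Melville (Clara Jennings Gregg) | female | 52.00 | 1 | 1 | 12749 | 93.5000 | B69 | S |
| 854 | 855 | 0 | 2 | Carter, Mrs. Ernest Courtenay (Lilian Hughes) | female | 44.00 | 1 | 0 | 244252 | 26.0000 | NaN | S |
| 245 | 246 | 0 | 1 | Minahan, Dr. William Edward | male | 44.00 | 2 | 0 | 19928 | 90.0000 | C78 | Q |
| 684 | 685 | 0 | 2 | Brown, Mr. Thomas William Solomon | male | 60.00 | 1 | 1 | 29750 | 39.0000 | NaN | S |
| 665 | 666 | 0 | 2 | Hickman, Mr. Lewis | male | 32.00 | 2 | 0 | S.O.C. 14879 | 73.5000 | NaN | S |
| 63 | 64 | 0 | 3 | Skoog, Master. Harald | male | 4.00 | 3 | 2 | 347088 | 27.9000 | NaN | S |
| 402 | 403 | 0 | 3 | Jussila, Miss. Mari Aina | female | 21.00 | 1 | 0 | 4137 | 9.8250 | NaN | S |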

Dataset B

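| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
| 443 | 444 | 1 | 2 | Reynaldo, Ms. Encarnacion | female | 28.0 | 0 | 0 | 230434 | 13.0000 | NaN | S |
| 834 | 835 | 0 | 3 | Allum, Mr. Owen George | male | 18.0 | 0 | 0 | 2223 | 8.3000 | NaN | S |
| 423 | 424 | 0 | 3 | Danbom, Mrs. Ernst Gilbert (Anna Sigrid Maria Brogren) | female | 28.0 | 1 | 1 | 347080 | 14.4000 | NaN | S |
| 495 | 496 | 0 | 3 | Yousseff, Mr. Gerious | male | NaN | 0 | 0 | 2627 | 14.4583 | NaN | C |
| 498 | 499 | 0 | 1 | Allison, Mrs. Hudson J C (Bessie Waldo Daniels) | female | 25.0 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S |
| 641 | 642 | 1 | 1 | Sagesser, Mlle. Emma | female | 24.0 | 0 | 0 | PC 17477 | 69.3000 | B35 | C |
| 184 | 185 | 1 | 3 | Kink-Heilmann, Miss. Luise Gretchen | female | 4.0 | 0 | 2 | 315153 | 22.0250 | NaN | S |
| 856 | 857 | 1 | 1 | Wick, Mrs. George Dennick (Mary Hitchcock) | female | 45.0 | 1 | 1 | 36928 | 164.8667 | NaN | S |
| 53 | 54 | 1 | 2 | Faunthorpe, Mrs. Lizzie (Elizabeth Anne Wilkinson) | female | 29.0 | 1 | 0 | 2926 | 26.0000 | NaN | S |
| 657 | 658 | 0 | 3 | Bourke, Mrs. John (Catherine) | female | 32.0 | 1 | 1 | 364849 | 15.5000 | NaN | Q |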

Dataset A

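| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
| 322 | 323 | 1 | 2 | Slayter, Miss. Hilda Mary | female | 30.0 | 0 | 0 | 234818 | 12.3500 | NaN | Q |
| 698 | 699 | 0 | 1 | Thayer, Mr. John Borland | male | 49.0 | 1 | 1 | 17421 | 110.8833 | C68 | C |
| 202 | 203 | 0 | 3 | Johanson, Mr. Jakob Alfred | male | 34.0 | 0 | 0 | 3101264 | 6.4958 | NaN | S |
| 248 | 249 | 1 | 1 | Beckwith, Mr. Richard Leonard | male | 37.0 | 1 | 1 | 11751 | 52.5542 | D35 | S |
| 530 | 531 | 1 | 2 | Quick, Miss. Phyllis May | female | 2.0 | 1 | 1 | 26360 | 26.0000 | NaN | S |
| 385 | 386 | 0 | 2 | Davies, Mr. Charles Henry | male | 18.0 | 0 | 0 | S.O.C. 14879 | 73.5000 | NaN | S |
| 168 | 169 | 0 | 1 | Baumann, Mr. John D | male | NaN | 0 | 0 | PC 17318 | 25.9250 | NaN | S |
| 444 | 445 | 1 | 3 | Johannesen-Bratthammer, Mr. Bernt | male | NaN | 0 | 0 | 65306 | 8.1125 | NaN | S |
| 723 | 724 | 0 | 2 | Hodges, Mr. Henry Price | male | 50.0 | 0 | 0 | 250643 | 13.0000 | NaN | S |
| 144 | 145 | 0 | 2 | Andrew, Mr. Edgardo Samuel | male | 18.0 | 0 | 0 | 231945 | 11.5000 | NaN | S |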

Dataset B

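| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
| 130 | 131 | 0 | 3 | Drazenoic, Mr. Jozef | male | 33.0 | 0 | 0 | 349241 | 7.8958 | NaN | C |
| 187 | 188 | 1 | 1 | Romaine, Mr. Charles Hallace ("Mr C Rolmane") | male | 45.0 | 0 | 0 | 111428 | 26.5500 | NaN | S |
| 566 | 567 | 0 | 3 | Stoytcheff, Mr. Ilia | male | 19.0 | 0 | 0 | 349205 | 7.8958 | NaN | S |
| 299 | 300 | 1 | 1 | Baxter, Mrs. James (Helene DeLaudeniere Chaput) | female | 50.0 | 0 | 1 | PC 17558 | 247.5208 | B58 B60 | C |
| 798 | 799 | 0 | 3 | Ibrahim Shawah, Mr. Yousseff | male | 30.0 | 0 | 0 | 2685 | 7.2292 | NaN | C |
| 55 | 56 | 1 | 1 | Woolner, Mr. Hugh | male | NaN | 0 | 0 | 19947 | 35.5000 | C52 | S |
| 795 | 796 | 0 | 2 | Otter, Mr. Richard | male | 39.0 | 0 | 0 | 28213 | 13.0000 | NaN | S |
| 12 | 13 | 0 | 3 | Saundercock, Mr. William Henry | male | 20.0 | 0 | 0 | A/5. 2151 | 8.0500 | NaN | S |
| 869 | 870 | 1 | 3 | Johnson, Master. Harold Theodor | male | 4.0 | 1 | 1 | 347742 | 11.1333 | NaN | S |
| 765 | 766 | 1 | 1 | Hogeboom, Mrs. John C (Anna Andrews) | female | 51.0 | 1 | 0 | 13502 | 77.9583 | D11 | S |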

Duplicate rows

Dataset A

Dataset does not contain duplicate rows.

Dataset B

Dataset does not contain duplicate rows.
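
A duplicate-row check of this kind can be sketched by counting fully identical rows. This is a minimal standard-library illustration, not the report's own code; `find_duplicate_rows` and the sample tuples are hypothetical. With pandas, `df.duplicated()` marks repeated rows in a similar way. Because PassengerId is unique in both datasets, no full-row duplicates are expected here.

```python
from collections import Counter


def find_duplicate_rows(rows):
    """Return each fully repeated row together with how often it occurs."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Made-up (PassengerId, Survived, Pclass) rows for illustration only.
unique_rows = [(1, 0, 3), (2, 1, 1), (3, 1, 3)]
with_repeat = unique_rows + [(2, 1, 1)]

print(find_duplicate_rows(unique_rows))
print(find_duplicate_rows(with_repeat))
```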